<!DOCTYPE html>
<html lang="en" xml:lang="en">
  <head>
    <title>Scientific studies and confounding</title>
    <meta charset="utf-8" />
    <meta name="author" content="datasciencebox.org" />
    <script src="libs/header-attrs/header-attrs.js"></script>
    <link href="libs/font-awesome/css/all.css" rel="stylesheet" />
    <link href="libs/font-awesome/css/v4-shims.css" rel="stylesheet" />
    <link href="libs/panelset/panelset.css" rel="stylesheet" />
    <script src="libs/panelset/panelset.js"></script>
    <script src="libs/kePrint/kePrint.js"></script>
    <link href="libs/lightable/lightable.css" rel="stylesheet" />
    <link rel="stylesheet" href="../xaringan-themer.css" type="text/css" />
    <link rel="stylesheet" href="../slides.css" type="text/css" />
  </head>
  <body>
    <textarea id="source">
class: center, middle, inverse, title-slide

.title[
# Scientific studies and confounding
]
.subtitle[
## <br><br> Data Science in a Box
]
.author[
### <a href="https://datasciencebox.org/">datasciencebox.org</a>
]

---





layout: true
  
&lt;div class="my-footer"&gt;
&lt;span&gt;
&lt;a href="https://datasciencebox.org" target="_blank"&gt;datasciencebox.org&lt;/a&gt;
&lt;/span&gt;
&lt;/div&gt; 

---



class: middle

# Scientific studies

---

## Scientific studies

.pull-left[
**Observational**  
- Collect data in a way that does not interfere with how the data arise ("observe")
- Establish associations
]
.pull-right[
**Experimental**  
- Randomly assign subjects to treatments
- Establish causal connections
]

---

.question[
What type of study is the following, observational or experimental? What does that mean in terms of causal conclusions?

*Researchers studying the relationship between exercising and energy levels asked participants in their study how many times a week they exercise and whether they have high or low energy when they wake up in the morning.*

*Based on responses to the exercise question the researchers grouped people into three categories (no exercise, exercise 1-3 times a week, and exercise more than 3 times a week).* 

*The researchers then compared the proportions of people who said they have high energy in the mornings across the three exercise categories.*
]

---

.question[
What type of study is the following, observational or experimental? What does that mean in terms of causal conclusions?

*Researchers studying the relationship between exercising and energy levels randomly assigned participants in their study into three groups: no exercise, exercise 1-3 times a week, and exercise more than 3 times a week.* 

*After one week, participants were asked whether they have high or low energy when they wake up in the morning.*

*The researchers then compared the proportions of people who said they have high energy in the mornings across the three exercise categories.*
]

---

class: middle

# Case study: Breakfast cereal keeps girls slim

---


.midi[
&gt; *Girls who ate breakfast of any type had a lower average body mass index (BMI), a common obesity gauge, than those who said they didn't. The index was even lower for girls who said they ate cereal for breakfast, according to findings of the study conducted by the Maryland Medical Research Institute with funding from the National Institutes of Health (NIH) and cereal-maker General Mills.* [...]
&gt;
&gt; *The results were gleaned from a larger NIH survey of 2,379 girls in California, Ohio, and Maryland who were tracked between the ages of 9 and 19.* [...]
&gt;
&gt;*As part of the survey, the girls were asked once a year what they had eaten during the previous three days.* [...]
]

.footnote[
Source: [Study: Cereal Keeps Girls Slim](https://www.cbsnews.com/news/study-cereal-keeps-girls-slim/), Retrieved Sep 13, 2018.
]

---

## Explanatory and response variables

- Explanatory variable: Whether the participant ate breakfast or not

- Response variable: BMI of the participant


---

## Three possible explanations

--

1. Eating breakfast causes girls to be slimmer 


--
2. Being slim causes girls to eat breakfast


--
3. A third variable is responsible for both: a **confounding** variable, an extraneous variable that affects both the explanatory and the response variable and makes it seem like there is a causal relationship between them

---

## Correlation != causation

&lt;img src="img/xkcdcorrelation.png" width="80%" height="50%" style="display: block; margin: auto;" /&gt;

.footnote[
Source: Randall Munroe, [xkcd.com/552](http://xkcd.com/552/), CC BY-NC 2.5
]

---

## Studies and conclusions

&lt;img src="img/random_sample_assign_grid.png" width="80%" height="50%" style="display: block; margin: auto;" /&gt;

---

class: middle

# Case study: Climate change survey

---

## Survey question

&gt;A July 2019 YouGov survey asked 1,633 randomly selected adults in GB and 
1,333 randomly selected adults in the US which of the following statements 
about the global environment best describes their view:
&gt;
&gt;- The climate is changing and human activity is mainly responsible  
&gt;- The climate is changing and human activity is partly responsible, together with other factors  
&gt;- The climate is changing but human activity is not responsible at all  
&gt;- The climate is not changing  

---

## Survey data

&lt;br&gt;

.small[


&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style="text-align:left;"&gt;   &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing and human activity is mainly responsible &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing and human activity is partly responsible, together with other factors &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing but human activity is not responsible at all &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is not changing &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; Don't know &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; Sum &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; GB &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 833 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 604 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 49 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 33 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 114 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 1633 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; US &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 507 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 493 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 120 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 80 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 133 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 1333 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; Sum &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 1340 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 1097 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 169 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 113 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 247 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 2966 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
]

.footnote[
Source: [YouGov - International Climate Change Survey](https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/epjj0nusce/YouGov%20-%20International%20climate%20change%20survey.pdf)
]

---

.question[
What percent of **all respondents** think the climate is changing and  
human activity is mainly responsible?  
]

.small[
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style="text-align:left;"&gt;   &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing and human activity is mainly responsible &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing and human activity is partly responsible, together with other factors &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing but human activity is not responsible at all &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is not changing &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; Don't know &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; Sum &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; GB &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 833 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 604 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 49 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 33 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 114 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 1633 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; US &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 507 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 493 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 120 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 80 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 133 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 1333 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; Sum &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 1340 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 1097 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 169 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 113 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 247 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 2966 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
]

--


```r
(all &lt;- 1340 / 2966)
```

```
## [1] 0.4517869
```


---

.question[
What percent of **GB respondents** think the climate is changing and  
human activity is mainly responsible?  
]

.small[
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style="text-align:left;"&gt;   &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing and human activity is mainly responsible &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing and human activity is partly responsible, together with other factors &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing but human activity is not responsible at all &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is not changing &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; Don't know &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; Sum &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; GB &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 833 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 604 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 49 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 33 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 114 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 1633 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; US &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 507 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 493 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 120 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 80 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 133 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 1333 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; Sum &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 1340 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 1097 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 169 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 113 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 247 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 2966 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
]

--


```r
(gb &lt;- 833 / 1633)
```

```
## [1] 0.5101041
```

---

.question[
What percent of **US respondents** think the climate is changing and  
human activity is mainly responsible?  
]

.small[
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style="text-align:left;"&gt;   &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing and human activity is mainly responsible &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing and human activity is partly responsible, together with other factors &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is changing but human activity is not responsible at all &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; The climate is not changing &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; Don't know &lt;/th&gt;
   &lt;th style="text-align:right;"&gt; Sum &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; GB &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 833 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 604 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 49 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 33 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 114 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 1633 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; US &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 507 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 493 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 120 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 80 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 133 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 1333 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style="text-align:left;"&gt; Sum &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 1340 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 1097 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 169 &lt;/td&gt;
   &lt;td style="text-align:right;width: 0.5 in; "&gt; 113 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 247 &lt;/td&gt;
   &lt;td style="text-align:right;"&gt; 2966 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
]

--


```r
(us &lt;- 507 / 1333)
```

```
## [1] 0.3803451
```

---

.question[
Based on the percentages we calculated, does there appear to be a relationship 
between country and beliefs about climate change? If yes, could there be another variable that explains this relationship?
]

.pull-left[

```r
all
```

```
## [1] 0.4517869
```

```r
gb
```

```
## [1] 0.5101041
```

```r
us
```

```
## [1] 0.3803451
```
]
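
.pull-right[
The same proportions could also be computed from the full table in one step. This is only a sketch: the name `counts` and the short column labels are illustrative and not part of the survey.

```r
# Counts from the survey table (short labels are illustrative)
counts &lt;- matrix(
  c(833, 604,  49, 33, 114,
    507, 493, 120, 80, 133),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("GB", "US"),
    c("mainly", "partly", "not_resp",
      "not_changing", "dont_know")
  )
)

# Row proportions: each row sums to 1
props &lt;- prop.table(counts, margin = 1)
round(props[, "mainly"], 3)
```
]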

---

## Conditional probability

**Notation**: `\(P(A | B)\)`: Probability of event A given event B

- What is the probability that it will be unseasonably warm tomorrow?
- What is the probability that it will be unseasonably warm tomorrow, given that it was unseasonably warm today?
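
The survey counts give a concrete case. As a rough sketch, assuming sample proportions stand in for probabilities, let A be "answers mainly responsible" and B be "is from GB" (the variable names below are illustrative):

```r
# Counts from the survey table
n_all &lt;- 2966        # all respondents
n_gb &lt;- 1633         # GB respondents
n_gb_mainly &lt;- 833   # GB respondents answering "mainly responsible"

# P(A | B) = P(A and B) / P(B)
p_b &lt;- n_gb / n_all
p_a_and_b &lt;- n_gb_mainly / n_all
(p_a_given_b &lt;- p_a_and_b / p_b)
```

This is the same quantity as `gb` computed earlier, 833 / 1633, or about 0.51.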

---

## Independence

- If knowing event A happened tells you something about event B happening, or vice versa, then events A and B are not independent

- If not, they are said to be independent

- In notation: if A and B are independent, then `\(P(A | B) = P(A)\)`
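
A rough check against the survey data, assuming sample proportions stand in for probabilities and ignoring sampling variability (the variable names are illustrative), with A being "answers mainly responsible":

```r
# Proportions answering "mainly responsible", from the survey table
p_a &lt;- 1340 / 2966          # P(A): all respondents
p_a_given_gb &lt;- 833 / 1633  # P(A | GB)
p_a_given_us &lt;- 507 / 1333  # P(A | US)

# Under independence these differences would be close to zero
p_a_given_gb - p_a
p_a_given_us - p_a
```

The differences are roughly 0.06 and -0.07, so in this sample the response does not look independent of country. Whether gaps of this size could arise by chance is a separate question that this sketch does not address.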
    </textarea>
<style data-target="print-only">@media screen {.remark-slide-container{display:block;}.remark-slide-scaler{box-shadow:none;}}</style>
<script src="https://remarkjs.com/downloads/remark-latest.min.js"></script>
<script>var slideshow = remark.create({
"ratio": "16:9",
"highlightLines": true,
"highlightStyle": "solarized-light",
"countIncrementalSlides": false
});
if (window.HTMLWidgets) slideshow.on('afterShowSlide', function (slide) {
  window.dispatchEvent(new Event('resize'));
});
(function(d) {
  var s = d.createElement("style"), r = d.querySelector(".remark-slide-scaler");
  if (!r) return;
  s.type = "text/css"; s.innerHTML = "@page {size: " + r.style.width + " " + r.style.height +"; }";
  d.head.appendChild(s);
})(document);

(function(d) {
  var el = d.getElementsByClassName("remark-slides-area");
  if (!el) return;
  var slide, slides = slideshow.getSlides(), els = el[0].children;
  for (var i = 1; i < slides.length; i++) {
    slide = slides[i];
    if (slide.properties.continued === "true" || slide.properties.count === "false") {
      els[i - 1].className += ' has-continuation';
    }
  }
  var s = d.createElement("style");
  s.type = "text/css"; s.innerHTML = "@media print { .has-continuation { display: none; } }";
  d.head.appendChild(s);
})(document);
// delete the temporary CSS (for displaying all slides initially) when the user
// starts to view slides
(function() {
  var deleted = false;
  slideshow.on('beforeShowSlide', function(slide) {
    if (deleted) return;
    var sheets = document.styleSheets, node;
    for (var i = 0; i < sheets.length; i++) {
      node = sheets[i].ownerNode;
      if (node.dataset["target"] !== "print-only") continue;
      node.parentNode.removeChild(node);
    }
    deleted = true;
  });
})();
// add `data-at-shortcutkeys` attribute to <body> to resolve conflicts with JAWS
// screen reader (see PR #262)
(function(d) {
  let res = {};
  d.querySelectorAll('.remark-help-content table tr').forEach(tr => {
    const t = tr.querySelector('td:nth-child(2)').innerText;
    tr.querySelectorAll('td:first-child .key').forEach(key => {
      const k = key.innerText;
      if (/^[a-z]$/.test(k)) res[k] = t;  // must be a single letter (key)
    });
  });
  d.body.setAttribute('data-at-shortcutkeys', JSON.stringify(res));
})(document);
(function() {
  "use strict"
  // Replace <script> tags in slides area to make them executable
  var scripts = document.querySelectorAll(
    '.remark-slides-area .remark-slide-container script'
  );
  if (!scripts.length) return;
  for (var i = 0; i < scripts.length; i++) {
    var s = document.createElement('script');
    var code = document.createTextNode(scripts[i].textContent);
    s.appendChild(code);
    var scriptAttrs = scripts[i].attributes;
    for (var j = 0; j < scriptAttrs.length; j++) {
      s.setAttribute(scriptAttrs[j].name, scriptAttrs[j].value);
    }
    scripts[i].parentElement.replaceChild(s, scripts[i]);
  }
})();
(function() {
  var links = document.getElementsByTagName('a');
  for (var i = 0; i < links.length; i++) {
    if (/^(https?:)?\/\//.test(links[i].getAttribute('href'))) {
      links[i].target = '_blank';
    }
  }
})();
// adds .remark-code-has-line-highlighted class to <pre> parent elements
// of code chunks containing highlighted lines with class .remark-code-line-highlighted
(function(d) {
  const hlines = d.querySelectorAll('.remark-code-line-highlighted');
  const preParents = [];
  const findPreParent = function(line, p = 0) {
    if (p > 1) return null; // traverse up no further than grandparent
    const el = line.parentElement;
    return el.tagName === "PRE" ? el : findPreParent(el, ++p);
  };

  for (let line of hlines) {
    let pre = findPreParent(line);
    if (pre && !preParents.includes(pre)) preParents.push(pre);
  }
  preParents.forEach(p => p.classList.add("remark-code-has-line-highlighted"));
})(document);</script>

<script>
slideshow._releaseMath = function(el) {
  var i, text, code, codes = el.getElementsByTagName('code');
  for (i = 0; i < codes.length;) {
    code = codes[i];
    if (code.parentNode.tagName !== 'PRE' && code.childElementCount === 0) {
      text = code.textContent;
      if (/^\\\((.|\s)+\\\)$/.test(text) || /^\\\[(.|\s)+\\\]$/.test(text) ||
          /^\$\$(.|\s)+\$\$$/.test(text) ||
          /^\\begin\{([^}]+)\}(.|\s)+\\end\{[^}]+\}$/.test(text)) {
        code.outerHTML = code.innerHTML;  // remove <code></code>
        continue;
      }
    }
    i++;
  }
};
slideshow._releaseMath(document);
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
  var script = document.createElement('script');
  script.type = 'text/javascript';
  script.src  = 'https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-MML-AM_CHTML';
  if (location.protocol !== 'file:' && /^https?:/.test(script.src))
    script.src  = script.src.replace(/^https?:/, '');
  document.getElementsByTagName('head')[0].appendChild(script);
})();
</script>
  </body>
</html>
